Elasticutor: Rapid Elasticity for Realtime Stateful Stream Processing
Abstract
Elasticity is highly desirable for stream processing systems to guarantee low latency against workload dynamics, such as surges in data arrival rate and fluctuations in data distribution. Existing systems achieve elasticity following a resource-centric approach that uses dynamic key partitioning across the parallel instances, i.e., executors, to balance the workload and scale operators. However, such operator-level key repartitioning needs global synchronization and prohibits rapid elasticity. To address this problem, we propose an executor-centric approach, whose core idea is to avoid operator-level key repartitioning while implementing each executor as the building block of elasticity. Following this new approach, we design the Elasticutor framework with two levels of optimization: i) a novel implementation of executors, i.e., elastic executors, that perform elastic multi-core execution via efficient intra-executor load balancing and executor scaling, and ii) a global model-based scheduler that dynamically allocates CPU cores to executors based on the instantaneous workloads. We implemented a prototype of Elasticutor and conducted extensive experiments. Our results show that Elasticutor doubles the throughput and achieves an average processing latency up to two orders of magnitude lower than previous methods, for a dynamic workload of real-world applications.
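To make the scheduler idea concrete, the following is a minimal sketch of load-proportional core allocation. It is an illustration under stated assumptions, not the model used in the paper: the abstract does not describe the scheduler's model, so the function name `allocate_cores`, the "at least one core per executor" rule, and the largest-remainder rounding are all hypothetical choices.

```python
from typing import Dict


def allocate_cores(load: Dict[str, float], total_cores: int) -> Dict[str, int]:
    """Split total_cores across executors in proportion to their measured load.

    Hypothetical policy (not taken from the paper): every executor keeps at
    least one core, and the spare cores are shared out by largest remainder.
    """
    if total_cores < len(load):
        raise ValueError("need at least one core per executor")

    # Start every executor with one core so none is starved.
    alloc = {name: 1 for name in load}
    spare = total_cores - len(load)
    total_load = sum(load.values())
    if spare == 0 or total_load <= 0:
        return alloc

    # Ideal fractional share of the spare cores for each executor.
    shares = {name: spare * value / total_load for name, value in load.items()}
    for name, share in shares.items():
        alloc[name] += int(share)

    # Hand out the cores lost to rounding, largest fractional part first.
    leftover = total_cores - sum(alloc.values())
    by_remainder = sorted(
        load, key=lambda name: shares[name] - int(shares[name]), reverse=True
    )
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Example: executor "a" receives most of the tuples, then the hot spot moves to "b".
before = allocate_cores({"a": 800.0, "b": 100.0, "c": 100.0}, total_cores=8)
after = allocate_cores({"a": 100.0, "b": 800.0, "c": 100.0}, total_cores=8)
print(before, after)
```

Calling the function again with fresh load measurements moves cores toward the executor that is currently hot, while the assignment of keys to executors stays fixed, which is the property the executor-centric approach relies on.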
Related Resources
When Stream Processing crosses MapReduce
Although Event Stream Processing (ESP) systems have existed for more than a decade, we have recently witnessed a true renaissance of ESP systems that have adopted the popular MapReduce paradigm. In this white paper, we advocate for the StreamMapReduce approach as it (i) allows a quick and easy transition of legacy MapReduce-based applications to ESP, (ii) simplifies the implementation of fault toleran...
Elastic and Secure Energy Forecasting in Cloud Environments
Although cloud computing offers many advantages with regard to the adaptation of resources, we witness either strong resistance to or very slow adoption of these new offerings. One reason for the resistance is that many technologies such as stream processing systems (i) still lack appropriate mechanisms for elasticity in order to fully harness the power of the cloud, and (ii) do not provide mech...
Efficient Migration of Very Large Distributed State for Scalable Stream Processing
Any scalable stream data processing engine must handle the dynamic nature of data streams and it must quickly react to every fluctuation in the data rate. Many systems successfully address data rate spikes through resource elasticity and dynamic load balancing. The main challenge is the presence of stateful operators because their internal, mutable state must be scaled out while assuring fault-...
Optimal Operator State Migration for Elastic Data Stream Processing
A cloud-based data stream management system (DSMS) handles fast data by utilizing the massively parallel processing capabilities of the underlying platform. An important property of such a DSMS is elasticity, meaning that nodes can be dynamically added to or removed from an application to match the latter’s workload, which may fluctuate in an unpredictable manner. For an application involving s...
Stateful Scalable Stream Processing at LinkedIn
Distributed stream processing systems need to support stateful processing, recover quickly from failures to resume such processing, and reprocess an entire data stream quickly. We present Apache Samza, a distributed system for stateful and fault-tolerant stream processing. Samza utilizes a partitioned local state along with a low-overhead background changelog mechanism, allowing it to scale to ...
Journal title: CoRR
Volume: abs/1711.01046
Publication date: 2017